Philosophical Transactions of the Royal Society B — Latest Matching Preprints

1

MeshScope-Region: Distribution, Road-Network Accessibility, and Nine-Year Evolution of ICU and HCU Capacity Across Japan's 330 Secondary Medical Areas

Ohno, K.; Hirai, M.; Hashimoto, S.

2026-07-20 health informatics 10.64898/2026.07.17.26358374 medRxiv

Top 1%

0.5%

Background: In Japan, health planning is organized around secondary medical areas (SMAs; niji-iryo-ken; 330 areas in the 2025 classification), yet nationwide analyses of intensive care unit (ICU) capacity have been conducted mainly at the prefecture level, and a recent SMA-level study addressed only the presence or absence of ICUs. The full supply structure of intensive and intermediate critical care - ICU and high care unit (HCU) beds - has not been characterized at the SMA level with respect to its composition, road-network accessibility, and evolution over time. Methods: We developed MeshScope-Region, an analytical platform built on the Hospital Bed Function Reports (byosho-kino-hokoku) for fiscal years 2016-2024, in which ICU and HCU beds were identified from notified reimbursement categories and aggregated to SMAs. Three analytical layers were integrated: (1) cross-sectional distribution of ICU/HCU beds; (2) nationwide road-network accessibility computed with the Open Source Routing Machine (OSRM) from 176,962 populated 1-km census grid cells to all facilities reporting ICU or HCU beds; and (3) a nine-year longitudinal analysis of supply-structure types, classified by k-means (k = 6) in an 8-dimensional PCA space anchored to fiscal year 2024, with earlier years projected into the same space. Results: In fiscal year 2024, 20,631 ICU/HCU beds were reported nationally (7,114 ICU-type; 13,517 HCU-type) at 1,044 facilities. Zone-level totals among SMAs with any beds ranged 229-fold (3-688 beds); the 90th/10th percentile ratio of per-capita density was 3.6. In total, 90.1% of the population resided within 30 minutes' drive of a facility with ICU beds and 97.8% within 60 minutes; only 0.8% resided beyond 90 minutes. Although 140 of the 330 SMAs had no ICU facility within their own boundaries, 84.7% of their residents could reach an ICU facility in an adjacent area within 60 minutes' drive. 
Longitudinally, supply structures were highly persistent: 63.0% of SMAs (208/330) retained the same structural type across all nine years, adjacent-year rank correlations of a supply-vulnerability index were 0.887-0.924 (2016 vs. 2024: rho = 0.711), and the number of SMAs with zero ICU beds remained frozen at 133-141. The Gini coefficient of bed distribution declined from 0.384 to 0.262 - although computed on ICU-type beds alone it remained 0.365 in fiscal year 2024 - and capacity growth (total +27.9%) was driven predominantly by HCU beds (+41.6%) while ICU beds grew only +8.0%. Conclusions: Japan's critical care supply structure is regionally rigid, with a stable set of approximately 140 SMAs lacking ICU beds for nearly a decade, yet road-network accessibility substantially mitigates the consequences of zone-level absence. Recent capacity growth - and much of the apparent equalization - has occurred predominantly in intermediate care. MeshScope-Region provides a standing, reproducible evidence base at the geographic unit of Japan's medical planning cycles.
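
The Gini coefficient quoted above summarizes how unevenly beds are spread across areas. As a rough illustration only, the following sketch computes an unweighted Gini coefficient from a list of per-area bed counts. The function name `gini` and the input values are hypothetical, and the abstract does not state whether the authors weighted by population, so this should not be read as their exact method.

```python
def gini(values):
    """Unweighted Gini coefficient of non-negative values.

    0 means every area holds the same amount; values near 1 mean
    the total is concentrated in very few areas.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted form: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    # with x sorted ascending and ranks i = 1..n.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


if __name__ == "__main__":
    # Hypothetical bed counts for four areas (not data from the study).
    print(gini([5, 5, 5, 5]))
    print(gini([0, 0, 0, 10]))
```

Computing the same statistic separately for ICU-type beds and for all ICU/HCU beds is what would allow the kind of contrast the abstract reports between the two measures.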

2

How Do Nurses Make Clinical Decisions Via Remote Reviews: A Convergent Mixed-Methods Study

Zhang, Y.; Sutherland, S.; Greenway, K.; Stayt, L.

2026-07-17 nursing 10.64898/2026.07.15.26357946 medRxiv

Top 2%

0.4%

Background: Remote clinical reviews have become an integral component of contemporary nursing practice across community and acute care settings. Nurses increasingly make autonomous clinical decisions using telephone, video, and online/digital systems, often with limited sensory information and under conditions of uncertainty. However, empirical understanding of how nurses make clinical decisions via remote reviews remains limited. Aim: To explore and understand how registered nurses (RNs) make clinical decisions about patient care via remote reviews. Methods: A convergent mixed-methods design was employed. Quantitative data (analytic quantitative sample N=53) were collected using validated questionnaires that measured decision-making processes, physician-nurse collaboration, decision-making stress, and perceived decision-making ability. Qualitative data (N=23) were generated through semi-structured interviews. Data collection took place between October 2024 and April 2025. Quantitative data were analysed using descriptive statistics, correlation, and multiple regression. Qualitative data were analysed using framework analysis. Integration was achieved through pillar-building and theory-driven synthesis and illustrated by joint display tables. Results: Most nurses demonstrated a flexible decision-making style, integrating analytical and intuitive reasoning. Both analytical and intuitive processes were positively associated with perceived decision-making ability. Physician-nurse collaboration emerged as a strong predictor of decision-making confidence, while decision-related stress was not a significant predictor. Qualitative findings identified three themes: characteristics of remote review; making adaptive decisions shaped by both internal and external constraints and enablers; and external influencing factors. The integrated findings informed a theory-informed ICE framework to illustrate how nurses make clinical decisions via remote reviews.
Conclusion: Remote clinical decision-making is a dynamic cognitive-environmental process rather than a purely individual cognitive act. The ICE framework conceptualises this interaction, extending existing decision-making theories to digitally mediated care. Impact: Understanding remote decision-making supports training design, clinical governance, and the development of Artificial Intelligence-enhanced decision-support tools grounded in ecological bounded rationality. Patient or Public Contribution: Patient and public representatives contributed to stakeholder discussions that informed the development of the interview topic guide and the theoretical model. Patients or members of the public were not involved in recruitment, data collection, analysis, interpretation of findings, or preparation of the manuscript. Keywords: clinical decision-making, remote reviews, telehealth, nursing, mixed methods, ecological bounded rationality

3

Assessing electronic health record potential for adaptive learning in multimorbidity care in Sub-Saharan Africa: a mixed-methods study of Zimbabwe's Impilo system

Dhodho, E.; Choga, K.; Mundoga, F.; Chimberengwa, P. T.; Gongora, R. T.; Webb, K.; Chinyanga, T. T.; Banda, F.; Masiye, K.; Midzi, N.; Mudavanhu, J.; Katsidzira, A.; Manyiyo, B.; Apollo, T.; Chimbetete, C.; Mhlanga, T.; Mangisi, P.; Gwanzura, C.; Tsvangirayi, S.; Dixon, J.; Nitsch, D.

2026-07-19 health informatics 10.64898/2026.07.16.26357920 medRxiv

Top 2%

0.3%

Electronic health records (EHR) are increasingly recognised as critical digital infrastructure for integrated, patient-centred care in the context of rising multimorbidity. In low-resource settings, national EHRs may also support locally driven learning to improve adaptive care across chronic conditions. However, there is limited empirical evidence on whether and how these systems enable learning within routine care in ways that inform broader system adaptation. We conducted a qualitative multi-method assessment of Impilo, Zimbabwe's national EHR, to examine its capacity to support learning for integrated multimorbidity care at primary care level, using HIV-hypertension as a tracer condition pair. Guided by Friedman's socio-technical infrastructure model as the analytical framework and Learning Health Systems (LHS) theory as the interpretive framework, data were drawn from documentary review, ethnographic observation, patient journey mapping, and interviews with frontline health workers and key stakeholders. Frontline learning for person-centred multimorbidity care was actively generated through interpretation of patient trajectories, experiential adjustment, and coordination across HIV and hypertension services using both the EHR and paper-based artefacts such as registers and patient booklets. However, this learning remained largely encounter-bound and weakly stabilised. Impilo did not routinely provide usable longitudinal patient views, practice-facing analytic tools, or institutionalised mechanisms for collective reflection required to support integrated multimorbidity care. Consequently, learning was largely confined to incremental adjustment within existing workflows, with limited capacity to inform broader changes to care pathways, routines, or system design. 
These findings suggest that the principal barrier to developing LHS is not the absence of data or frontline learning capacity, but the lack of socio-technical arrangements that enable learning to stabilise and inform system adaptation. Digitalisation alone is insufficient to support adaptive multimorbidity care. Co-production with frontline health workers may provide a pathway for aligning digital system design with routine care realities.

4

State-dependent non-identifiability of the reproduction number under adaptive behavior: an empirical characterization from COVID-19 mobility

Sanchez, F.

2026-07-21 epidemiology 10.64898/2026.07.19.26358437 medRxiv

Top 3%

0.3%

The basic reproduction number R0 confounds pathogen biology with adaptive human contact behavior. Earlier epidemiological-economic theory predicted a forward-looking behavioral contact response but could not test it in the absence of appropriate behavioral data. Using directly measured mobility as an observable proxy for contact, we (i) estimate the behavioral response function directly from data; (ii) show that the biology/behavior decomposition, and hence the behavioral correction to R0, is not identified from an epidemic trajectory, the apparent constant-contact R0 being one endpoint of an observational-equivalence class that fits the factual curve identically yet diverges under counterfactuals; and (iii) characterize that divergence ("what R0 deletes") as state-dependent, unimodal in counterfactual severity and vanishing when behavior saturates. We then show that, across US jurisdictions, the correction is empirically bounded because risk-responsiveness and behavioral non-saturation are confounded (r=-0.57, n=51): where behavior could compensate, it was already maximal, and where it was not maximal it did not respond. What R0 deletes is thus real and structurally characterizable yet empirically modest here, for reasons the framework itself supplies.

5

Benchmarking Speech Recognition Models for Medical Consultations in Latin American Spanish: A Comparative Evaluation with Fine-Tuning

Carrillo, R. M.; Carbajal Serrano, A.; Condori Pinedo, P. S.

2026-07-16 public and global health 10.64898/2026.07.14.26358062 medRxiv

Top 3%

0.2%

BACKGROUND: Artificial intelligence (AI) medical scribes rely on speech-to-text (STT) models for transcription. Evaluations of STT models in non-English settings remain scarce. We benchmarked ten STT models on medical consultations in Latin American (LatAm) Spanish and assessed whether fine-tuning improves transcription accuracy. METHODS: We used ten YouTube videos depicting medical consultations, with human transcriptions as the ground truth. Five open-source models were evaluated (Whisper Large, Whisper Large v3, Whisper Large v3 Turbo, Voxtral Mini 3B, and Canary 1B v2), as were five closed-source models (gpt-4o-transcribe, gpt-4o-mini-transcribe, gemini-2.5-pro, Eleven Labs, and Assembly AI). Whisper Large v3 was fine-tuned, with one video withheld from training. Performance was assessed on the withheld video using Word Error Rate (WER), Character Error Rate (CER), BLEU Score, ROUGE-L, BERT Score, and Semantic Similarity. RESULTS: None of the fine-tuning iterations outperformed the vanilla Whisper Large v3. On the withheld video, Gemini-2.5-pro was the closed-source model with the best performance in four of six metrics. Compared with the closed-source models, the fine-tuned model never performed better (withheld video); conversely, compared with the open-source models, the fine-tuned model showed better performance across metrics, for instance BLEU score (63% vs 58% for the second-ranking model), BERT Score (89% vs 86%), semantic similarity (89% vs 83%), and CER (19% vs 20%). CONCLUSIONS: Whisper Large v3 and its fine-tuned variant are the best open-source STT models for transcribing medical conversations in LatAm Spanish. These findings provide an evidence base for developing AI medical scribes tailored to Spanish-speaking LatAm.
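
For readers unfamiliar with the two error-rate metrics named above, the sketch below shows one common way to compute WER and CER: the Levenshtein edit distance between the reference and the model output, divided by the reference length. The function names and the example sentences are hypothetical, and no text normalization (casing, punctuation) is applied here; the abstract does not say which normalization the authors used.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitution, insertion, deletion cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    # Hypothetical transcript pair, not taken from the study.
    print(wer("el paciente tiene fiebre alta", "el paciente tiene fiebre"))
```

Because both metrics divide by the reference length, a model that inserts many extra words can score above 100%, which is worth keeping in mind when comparing them with bounded metrics such as BLEU.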

6

Comparing Human and Large Language Model Responses to Patients Online Questions: Towards Multi-dimensional Patient-centered Support

Hussein, M. A.; Doshi, R.; He, L.; Reynolds, T.

2026-07-17 health informatics 10.64898/2026.07.15.26355314 medRxiv

Top 3%

0.2%

Patients and caregivers seek informational and emotional support throughout medical care, especially when interpreting unfamiliar laboratory test results. Although resources such as patient portals and online health communities (OHCs) help address questions, gaps remain. The emergence of large language models (LLMs) offers a potential complementary source of support to help patients and caregivers understand and use their test results. The objective of our study is to empirically compare LLM responses to patients' online questions containing their laboratory test results with responses written by peers in an OHC. We compared 519 peer replies to 122 laboratory test-related posts from an OHC with 488 responses generated by four LLMs, using mixed computational and qualitative methods. LLMs frequently provided clear explanations of medical terminology and structured interpretations of numeric results but were longer and less readable. Peers offered more personalized, context-specific emotional support. Overall, LLMs have the potential to complement peer responses in OHCs, but require greater emotional depth, reasoning transparency, and alignment with community norms.

7

Personality Traits, Trust, and Acceptance of Artificial Intelligence Assistive Systems: Evidence from Nigeria Population

Onah, C.; Ogwuche, C. H.; Haruna, A. I.

2026-07-17 psychiatry and clinical psychology 10.64898/2026.07.16.26358233 medRxiv

Top 4%

0.2%

The increasing deployment of artificial intelligence (AI) assistive systems across healthcare, education, and organisational domains necessitates a deeper understanding of dispositional factors shaping trust and acceptance. This study investigated the Big Five personality traits as predictors of trust in and acceptance of AI assistive systems among a large adult sample (N = 380) in Makurdi, Benue State. Anchored in the Technology Acceptance Model (TAM) developed by Davis (1989), the study examined both direct and indirect pathways linking personality traits to AI acceptance through trust. Participants completed standardised measures of the Big Five Inventory, Trust in AI Scale, and AI Acceptance Scale. Data were analysed using structural equation modelling (SEM) with maximum likelihood estimation. The hypothesised model demonstrated good fit indices (CFI = .84, TLI = .82, RMSEA = .05). Openness to experience (β = .34, p < .001) and agreeableness (β = .27, p < .01) significantly predicted trust in AI systems, which in turn strongly predicted AI acceptance (β = .62, p < .001). Neuroticism negatively predicted trust (β = -.29, p < .001), while conscientiousness showed a modest positive direct effect on acceptance (β = .18, p < .05). Extraversion was not a significant direct predictor but exerted an indirect effect through trust. Mediation analysis confirmed that trust significantly mediated the relationship between personality traits and AI acceptance. The findings underscore the centrality of dispositional traits in shaping technological trust formation and highlight the psychological architecture underlying human-AI interaction. These results contribute to social psychological theory and provide empirical guidance for designing personality-sensitive AI systems to enhance user adoption and sustained engagement.

8

Developing and Prospectively Validating a Reproducible Graph Representation Specification for Clinical Guideline Algorithms: The Measurement Foundation of the Clinical Guideline Complexity Index

Milani, R. V.; Bober, R. M.

2026-07-20 health informatics 10.64898/2026.07.17.26358358 medRxiv

Top 4%

0.1%

Background. Translating a clinical guideline decision algorithm into a computational graph requires judgment, and unconstrained coding yields divergent graphs; any complexity measure computed from such a graph inherits that variation, so its reproducibility must be demonstrated rather than assumed. Objective. To develop, and prospectively test, an empirical method for making graph extraction reproducible, using the Clinical Guideline Complexity Index (CGCI) and four guideline algorithms as a case study. Methods. We built a Graph Representation Specification (an ontology, a motif catalogue, disambiguation conventions, decomposition rules, a deterministic validator, and a scoring engine) and refined it by error-driven grammar induction: measure inter-coder disagreement, localize its dominant class, induce a single grammar rule, and prospectively test whether that rule improves agreement in the anticipated class. Reproducibility was quantified with a pre-specified, topology-based endpoint (Decision Topology Agreement) rather than edge agreement, which is oversensitive to representational choices that do not affect the score. Two trained coders independently coded the diabetes, dyslipidemia, heart-failure, and hypertension algorithms. Results. A rule induced from the diabetes comorbidity panel (assessment topology) generated a pre-specified prediction that heart-failure figures, sharing the same motif, would converge; on a fresh, independently coded pair they did, with an absolute CGCI difference of approximately one. Decision topology reproduced closely (decision-order agreement at or near 1.00 for three of four guidelines), while breadth counting was rule-sensitive: an explicit modifier-counting rule reduced the largest disagreement from 27 to 4 tokens. Residual disagreement was bounded and localizable to specific, nameable representational choices. Conclusions. 
Graph-extraction reproducibility can be systematically improved through iterative grammar refinement, and a prospectively derived rule can be confirmed to improve agreement. These results establish the measurement foundation (reliability, not construct validity) for a companion study interpreting CGCI as cognitive load, and the method may apply wherever graphs are extracted from structured source artifacts.

9

Tomorrow's physicians in distress: prevalence, socioeconomic gradients, and modifiable determinants of mental health problems among 1,560 medical students in Southern Brazil - a multicentre cross-sectional study

Bacchi, G. N. V.; Furukawa, L. H.; Soares, A. B.; Turatti, A. P.; Horst, E. G.; Fillmann, G. P.; dos Santos, V. B.; de Leonco, A. R.; Karpovich, E.; Pereira, V.; Teles, M. V.; Reisdorfer, K. S.; Araujo, T. F. C.; Menezes, M.; Meneguetti, H.; Firpo, I.; Martins Costa Kessler da Silveira, M. I.; Tramontina, J. F.; Pacheco, J. P. G.; Alves, L. P. d. C.; Guidolin, B. L.; Tedesco, J. T.; Bittencourt, A. M. L.; Lapa, C. d. O.; Viola, T. W.; Pinto, L.; Braquehais, M. D.; Spanemberg, L.

2026-07-16 medical education 10.64898/2026.07.14.26357843 medRxiv

Top 6%

0.1%

Background: Medical students carry a disproportionate burden of mental health problems, but large multicentre studies from low- and middle-income countries remain scarce, and most evidence relies on single-institution samples and odds-ratio-based analyses that overstate associations for common outcomes. We provide a comprehensive epidemiological overview of mental health among medical students across an entire Brazilian state, quantifying the burden, its co-occurrence, and its socioeconomic and academic determinants. Methods: Cross-sectional online survey of 1,560 medical students covering all 20 medical schools in operation in Rio Grande do Sul, Brazil (August-December 2023). Validated instruments assessed depressive and anxiety symptoms (PHQ-4), suicidal ideation (PHQ-9 item 9), non-suicidal self-injury, burnout (ESB-eu), quality of life (EUROHIS-QOL-8), spirituality (SSRS), substance use (ASSIST), prescription stimulant misuse, and mistreatment during training. Associations were estimated as adjusted prevalence ratios (aPR) using modified Poisson regression with robust variance, with linear trend tests, prespecified interactions, sensitivity analyses and E-values. Results: Anxiety symptoms affected 64.5% (95% CI 62.1-66.9), depressive symptoms 45.5% (43.1-48.0), burnout 49.1% (46.7-51.6), recent suicidal ideation 20.6% (18.7-22.7), lifetime self-injury 13.1% (11.6-14.9) and stimulant misuse 7.9%; 57.4% screened positive for >=2 outcomes. Quality of life declined monotonically with cumulative burden. Low family income showed inverse gradients across five outcomes (e.g., depression: PR 0.93 per income level, p = 0.002). Four factors were independently associated with nearly all outcomes: short sleep (<6 h; aPR up to 2.73), minority sexual orientation (aPR up to 2.02), family psychiatric history, and mistreatment during training (reported by 55.6%; aPR up to 1.49). Stimulant misuse tripled from the basic cycle to clerkship (aPR 2.30, 95% CI 1.36-3.88).
There was no evidence of multiplicative interaction, indicating that the risk factors acted independently on the outcomes. Estimates were robust across sensitivity analyses. Conclusions: One in two medical students in this state-wide sample screened positive for at least two mental health problems. Socioeconomic vulnerability, sleep deprivation, mistreatment and minority status operate as independent and largely modifiable determinants - actionable targets for medical schools and policymakers.
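
The abstract's point that odds ratios overstate associations for common outcomes can be seen with simple arithmetic. The sketch below compares the prevalence ratio and the odds ratio for the same pair of prevalences; the function names and all numbers are hypothetical and chosen only for illustration, not taken from the study.

```python
def prevalence_ratio(p_exposed, p_unexposed):
    """Ratio of outcome prevalence in the exposed vs. the unexposed group."""
    return p_exposed / p_unexposed


def odds_ratio(p_exposed, p_unexposed):
    """Ratio of outcome odds in the exposed vs. the unexposed group."""
    odds_exposed = p_exposed / (1.0 - p_exposed)
    odds_unexposed = p_unexposed / (1.0 - p_unexposed)
    return odds_exposed / odds_unexposed


if __name__ == "__main__":
    # Hypothetical prevalences: a common outcome and a rare outcome,
    # both with the same prevalence ratio of 1.5.
    for label, p1, p0 in [("common", 0.60, 0.40), ("rare", 0.03, 0.02)]:
        print(label, round(prevalence_ratio(p1, p0), 3), round(odds_ratio(p1, p0), 3))
```

With a common outcome the odds ratio sits well above the prevalence ratio, while for a rare outcome the two nearly coincide, which is the usual motivation for reporting prevalence ratios in cross-sectional studies of frequent conditions.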

10

Where Do I Belong? Searching for fit in an unseen specialty: medical students paths to Youth Health Care

Muyselaar-Jellema, J. Z.; Könings, K. D.; van Dijk, A.; Kiefte-de Jong, J. C.; Nierkens, V.

2026-07-16 medical education 10.64898/2026.07.14.26358057 medRxiv

Top 6%

0.1%

Introduction: As healthcare systems increasingly shift toward prevention and community-based care, the demand for physicians in extramural specialties continues to grow. Yet a mismatch persists between workforce needs and medical students' career aspirations. Little is known about how medical students and trainees develop an interest in extramural specialties such as youth health care (YHC). This study explores trainees' trajectories toward becoming a youth health care physician (YHCP). Methods: We conducted a qualitative study using semi-structured online interviews with fourteen YHCPs in training. We combined an inductive and deductive approach, applying the person-environment (PE) fit framework to explore participants' evolving experiences of fit and misfit. Results: Participants described a growing misfit with the clinical culture of medical school (i.e. the hidden curriculum), particularly during hospital-based clerkships, combined with limited exposure to YHC. For some, this misfit extended to doubts about becoming a doctor. Over time, participants developed a sense of fit and belonging within YHC, either directly or after exploring other specialties, including extramural specialties. Discussion: These findings reframe specialty choice as a longitudinal search for belonging and alignment, in which trainees iteratively explore, evaluate, and refine their sense of fit across contexts. Clerkships serve as key sites for testing fit, yet also expose learners to the clinical culture, including the hidden curriculum. Broadening exposure and supporting reflective fit processes may encourage more medical students to choose extramural specialties, ultimately fostering a more balanced and sustainable alignment of the medical workforce.

11

Design tensions in a two-sided marketplace for reusable digital therapeutics software components: a qualitative interview study

Kowatsch, T.; Melamed, S.; Nissen, M.; Merz, Y.

2026-07-20 health informatics 10.64898/2026.07.17.26358332 medRxiv

Top 6%

0.1%

Objectives To identify stakeholder-perceived design tensions in a two-sided marketplace for reusable digital therapeutics (DTx) software components and to use these tensions to propose alternative marketplace concepts. Methods We conducted 24 semi-structured interviews with digital health researchers and professionals. Data were analysed using hybrid deductive-inductive codebook thematic analysis. The Magic Triangle provided the initial deductive structure. One researcher coded all transcripts; a second independently applied the developing codebook to five transcripts to refine definitions and consistency. Seventeen parent themes were synthesized into 12 design tensions, which informed three author-generated marketplace concepts. Results Participants described trade-offs concerning target users and host, component scope and customization, quality labels, verification, geographic scope, pricing, interoperability, platform launch, risks and market niche. The resulting concepts emphasized a regional startup ecosystem, a research-oriented hybrid marketplace or a global marketplace with stricter entry requirements. Discussion The concepts combine the tensions in different ways and highlight competing priorities in governance, openness, assurance, scalability and early platform growth. Conclusion Stakeholders identified recurring design choices for a DTx software-component marketplace. The concepts provide hypotheses for prototyping and evaluation; the study did not test technical feasibility, market demand, regulatory acceptability or effects on development cost or time.

12

Understanding Mental Health in Crisis: Key Determinants of Psychological Distress in Belgium during the first weeks of COVID-19 Lockdown

Zsabokorszky, Z.; Pepermans, K.; Van Den Broeck, K.; Beutels, P.; Hens, N.

2026-07-16 public and global health 10.64898/2026.07.15.26358143 medRxiv

Top 6%

0.1%

Aims: The COVID-19 pandemic has significantly impacted global mental health. At the onset of the pandemic (2020), Belgians experienced increased anxiety, depression, and psychological distress compared to 2018 due to the outbreak and the associated public health measures. Understanding the drivers of this distress is crucial for mitigating mental health effects in future crises. This study examines determinants of psychological distress in Belgium during the March 2020 lockdown, using data from the Great Corona Study (GCS). Methods: Data were drawn from the second wave of the GCS, a citizen science initiative conducted in Belgium on March 24, 2020, with 332,169 respondents. Psychological distress was measured using the General Health Questionnaire-12 (GHQ-12), applying a 2/3 cutoff to classify distress levels. To identify predictor variables, a random forest algorithm and literature review reduced 207 initial variables to 16. A generalized linear model was then used to examine associations between predictors and psychological distress. Results: Psychological distress was significantly associated with various demographic, social, occupational, and health-related factors. Younger individuals, women, and residents of Wallonia or Brussels exhibited higher odds of distress. Household composition and the frequency of real-life social interactions significantly influenced distress levels. Occupational status played a key role, with part-time employees and working students exhibiting higher levels of distress. At the same time, retired individuals with no current occupation showed lower odds. Perceived workplace safety and compliance with public health measures also significantly impacted distress levels. Lastly, individuals experiencing influenza-like or COVID-19 symptoms had substantially higher odds of psychological distress.
Conclusions: Our findings highlight significant sociodemographic, occupational, and health-related predictors of psychological distress during the initial COVID-19 lockdown in Belgium. Young adults, women, individuals with limited in-person interactions, and those experiencing influenza-like illness or COVID-19 symptoms were particularly vulnerable. Additionally, perceptions of others' adherence to preventive measures played a crucial role in mental well-being. These results highlight the complex interplay between individual and environmental factors in shaping psychological distress, providing valuable insights for future public health policies and mental health interventions during crises.

13

The Shape of a Final Message: An Emotional Landscape in the Language of Suicide

Pestian, J. P.; Jacobson, D. A.; Pedapati, E. V.; Mendonca, E. A.; McMahon, B. H.; Ive, J.; Glauser, T. A.

2026-07-17 psychiatry and clinical psychology 10.64898/2026.07.16.26358230 medRxiv

Top 6%

0.1%

The emotional content of suicide notes is typically examined using categorical coding, where each labeled passage is treated in isolation from its surrounding language. In contrast, dimensional models of psychopathology propose that affective content varies along continuous gradients. We evaluated this proposition directly. Excerpts from 884 annotated suicide notes were embedded in a semantic space defined solely by their linguistic properties, and we investigated whether human-assigned emotion labels changed smoothly across this space. They did: affective tone showed clear spatial autocorrelation (Moran's I = 0.18, z = 19.68, p < 0.001), an effect that replicated across three different encoders and remained after removing all within-note dependencies. Emotions occupied recognizable yet overlapping regions rather than forming distinct clusters and varied substantially in how tightly they were concentrated: love and hopelessness appeared with similar frequency, but love was far more localized (z = 15.7 versus 10.8). Among all emotions, hopelessness was the most linguistically diffuse, implying that a single categorical label is capturing multiple, qualitatively different manifestations of suicidal distress.
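
Moran's I, the statistic reported above, measures whether nearby points carry similar values. The following is a minimal sketch of the standard formula for a given neighbour-weight matrix. The function name `morans_i` and the toy chain of four points are hypothetical; the abstract does not say how neighbours were defined in the embedding space, so the weight matrix here is purely an assumption.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a square weight matrix w_ij.

    I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2,
    where W is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    w_total = sum(sum(row) for row in weights)
    if denom == 0 or w_total == 0:
        raise ValueError("values must vary and weights must not all be zero")
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    return (n / w_total) * num / denom


if __name__ == "__main__":
    # Hypothetical chain of four points; each point neighbours the next one.
    chain = [[0, 1, 0, 0],
             [1, 0, 1, 0],
             [0, 1, 0, 1],
             [0, 0, 1, 0]]
    print(morans_i([1, 1, -1, -1], chain))   # similar values sit together
    print(morans_i([1, -1, 1, -1], chain))   # values alternate
```

Positive values indicate that similar labels cluster among neighbours, values near zero indicate no spatial pattern, and negative values indicate alternation.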

14

Efficient stochastic epidemic simulation via the Sellke construction

van Boven, M.; Bootsma, M. C.

2026-07-17 epidemiology 10.64898/2026.07.16.26358219 medRxiv

Top 6%

0.1%

Stochastic epidemic models are a cornerstone of infectious disease epidemiology and are often used to study intervention scenarios. However, large run-to-run variability can make intervention effects difficult to estimate precisely. We revisit the epidemic Sellke construction, which assigns each individual an infection threshold for the cumulative infection hazard such that, conditional on the thresholds, the epidemic trajectory becomes deterministic. This enables coupling of simulations with and without an intervention, yielding low-variance effect estimates even when outcomes such as final size or peak incidence vary widely between runs. We develop an exact, event-driven implementation that maintains infection and recovery events in priority queues. Cumulative infection-hazard updates require O(log N) time per event, yielding overall complexity O(E log N) for E events in a population of size N. The implementation achieves computational performance comparable to the classical Gillespie algorithm while naturally accommodating non-Markovian infectious periods and complex infectiousness profiles. We illustrate the approach using distance-dependent spread of avian influenza between poultry farms in the Netherlands and a multilayer population with households, schools, and workplaces. In both examples, coupling enables efficient within-run comparisons of intervention scenarios across stochastic realisations.
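
To make the coupling idea concrete, here is a minimal sketch of a Sellke-style simulation for a homogeneously mixing SIR model. It is not the authors' implementation: the function names `sellke_sir` and `coupled_final_sizes`, the homogeneous-mixing assumption, and the exponential thresholds and infectious periods are all assumptions made for illustration. Thresholds and periods are drawn once and reused, so the baseline and intervention runs differ only in the transmission rate.

```python
import heapq
import random


def sellke_sir(thresholds, periods, beta, n_initial=1):
    """Final size (including initial cases) of a Sellke-construction SIR epidemic.

    thresholds: one infection threshold per initially susceptible individual.
    periods: infectious periods, consumed in order of infection
             (the first n_initial entries belong to the initial cases).
    beta: transmission rate (> 0); pressure grows at beta * I / n.
    """
    n = len(thresholds) + n_initial
    queue = sorted(thresholds)                 # susceptibles in infection order
    recoveries = list(periods[:n_initial])     # min-heap of recovery times
    heapq.heapify(recoveries)
    t, pressure, next_idx, infected = 0.0, 0.0, 0, n_initial
    while recoveries:
        rate = beta * len(recoveries) / n      # growth rate of cumulative pressure
        t_rec = recoveries[0]
        if next_idx < len(queue):
            t_inf = t + (queue[next_idx] - pressure) / rate
        else:
            t_inf = float("inf")
        if t_inf <= t_rec:
            # Pressure reaches the next threshold first: one new infection.
            t, pressure = t_inf, queue[next_idx]
            heapq.heappush(recoveries, t + periods[infected])
            infected += 1
            next_idx += 1
        else:
            # A recovery happens first: advance pressure up to that time.
            pressure += rate * (t_rec - t)
            t = heapq.heappop(recoveries)
    return infected


def coupled_final_sizes(n, beta_base, beta_intervention, seed=0):
    """Run baseline and intervention on the same thresholds and periods."""
    rng = random.Random(seed)
    thresholds = [rng.expovariate(1.0) for _ in range(n - 1)]
    periods = [rng.expovariate(1.0) for _ in range(n)]  # mean period 1 (assumption)
    return (sellke_sir(thresholds, periods, beta_base),
            sellke_sir(thresholds, periods, beta_intervention))


if __name__ == "__main__":
    print(coupled_final_sizes(200, beta_base=2.0, beta_intervention=1.0, seed=1))
```

Because both runs share the same random inputs, the difference in final size reflects the change in `beta` rather than run-to-run noise; in this simple setting a lower transmission rate can never yield a larger outbreak for the same thresholds and periods.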

15

Ceasing oxytocin in the active phase of the first stage of induced labours: A prospective audit at a tertiary hospital.

O'Dea, S.; De Vries, B.; Balendran, J.; Davis, G.; Phipps, H.; O'Brien, K.

2026-07-20 obstetrics and gynecology 10.64898/2026.07.17.26358359 medRxiv

Top 6%

0.1%

Introduction: Oxytocin is commonly used in the process of induction of labour and is associated with uterine hyperstimulation and abnormal fetal heart rate patterns that can increase the risk of adverse perinatal outcomes. Cessation of oxytocin in the active phase of induced labour has been shown in randomised trials to reduce uterine tachysystole and abnormal fetal heart rate traces, and may reduce caesarean section. We introduced a policy recommending cessation of oxytocin infusion in the active phase of the first stage of induced labour at a tertiary hospital in Sydney, Australia, and collated both clinical outcomes and maternal satisfaction following implementation. Methods: This was a prospective audit of a policy change at Royal Prince Alfred Hospital, comparing 600 women induced with oxytocin in the 6 months before the policy (November 2019 to May 2020) with 556 women induced in the 6 months after implementation (June to December 2020). Eligible women had a cervix ≥ 5 cm, an oxytocin infusion, and regular uterine contractions. The primary clinical outcome was caesarean delivery. The primary patient-centred outcome, maternal satisfaction, was measured using the Six Simple Questions questionnaire and collected in a subset of participants. Secondary outcomes included mode of birth, length of labour, uterine hyperstimulation, and perinatal outcomes. Results: Caesarean delivery occurred in 29% of women before and 28% after policy implementation (p=0.77). Instrumental birth increased from 25% to 27%, and instrumental birth for maternal indications increased from 6.8% to 13% (p=0.0005). Median length of labour increased by one hour (5.4 vs 6.4 hours, p=0.006). Oxytocin was ceased for at least two hours or until birth in 13% of women before the policy versus 35% after. Maternal satisfaction scores were modestly lower after implementation (median 41 vs 38, p=0.03).
Perinatal outcomes, including abnormal cord gases, Apgar scores, and NICU admission, were similar between groups. Conclusions: Implementing a policy of recommending cessation of oxytocin in the active phase of induced labour did not reduce caesarean delivery rates in a real-world tertiary hospital setting, despite trial-level evidence supporting the intervention. Poor uptake, negative staff perceptions, and a modest reduction in maternal satisfaction highlight barriers to translating trial efficacy into routine clinical practice. Adequately powered trials are needed to clarify optimal protocols for oxytocin cessation and its effects on maternal and perinatal outcomes.

16

How bursty infectiousness shapes epidemic dynamics

Kissler, S. M.

2026-07-17 epidemiology 10.64898/2026.07.15.26358199 medRxiv

Top 7%

0.1%

An epidemic's expected course is determined by the magnitude and timing of a typical person's infectiousness --- captured, in turn, by the basic reproduction number and the generation-time distribution. These fundamental, population-average quantities can mask individual-level variation that shapes how an epidemic actually unfolds: for example, individual variation in the magnitude of infectiousness (overdispersion) creates superspreading, a key feature of the SARS-CoV-1 and SARS-CoV-2 epidemics. However, the impact of individual variation in infectiousness timing is less well understood. Here, we demonstrate that individual infectiousness timing varies substantially and to different degrees across pathogens. For some common pathogens, including influenza, measles, and SARS-CoV-2, infectiousness is "bursty", or highly concentrated and variably-timed across individuals: for example, the window of appreciable infectiousness for SARS-CoV-2 may last for roughly a day, vs. the 9--12 days usually quoted. We show that bursty infectiousness creates superspreading without inherent superspreaders, makes epidemic timing more variable, amplifies the time-sensitivity of common interventions, and complicates inference of key epidemiological parameters. Together with the reproduction number, the generation-time distribution, and overdispersion, burstiness completes a family of basic parameters that govern how epidemics unfold.

17

Are CNV Risk Scores Linked to Neurodevelopmental and Mental Health Characteristics Within CNV-Associated Intellectual Disability?

Chi, Z.; Alexander-Bloch, A.; Neufeld, S. A.; Wolstencroft, J.; Skuse, D.; IMAGINE-ID consortium; Baker, K.

2026-07-16 psychiatry and clinical psychology 10.64898/2026.07.14.26358034 medRxiv

Top 7%

0.1%

Background: Children and young people (CYP) with intellectual disability (ID) frequently have co-occurring neurodevelopmental (ND) and mental health (MH) difficulties. While copy number variants (CNVs) are identified as an important aetiology of ID, it is unclear whether and how CNV risk scores predict ND and MH characteristics within the CNV-associated ID population. Methods: We analysed data from the UK-based IMAGINE-ID cohort of CYP (aged 4-19 years) with ID and clinically-reported CNVs (N = 1,640). CNVs were annotated with Gencode 19 in ENSEMBL to calculate CNV risk scores, including summed probability of loss-of-function intolerance (pLI) and dosage sensitivity. Multivariate regression models examined the prediction of CNV variables and inheritance on ND and MH characteristics, assessed via the Development and Well-Being Assessment (DAWBA). Post-hoc analyses explored CNV variable stratification (lower vs. higher range pLI). Results: Higher summed pLI scores (indexing CNV genes' intolerance to loss of function) unexpectedly predicted fewer MH difficulties and a lower likelihood of ND diagnoses, even after accounting for demographic factors and CNV inheritance. Post-hoc analyses identified a threshold effect. Within the lower pLI range, higher pLI scores were associated with greater MH difficulties, consistent with findings from population-based samples. In contrast, within the higher pLI range, higher pLI scores were associated with fewer MH difficulties (among individuals more likely to have severe ID). Conclusion: These findings challenge the assumption that CNV genomic "risk scores" universally predict ND and MH difficulties. Instead, within CNV-associated ID, complex relationships exist between CNV risk scores, inheritance and phenotypes. These insights emphasise the necessity of integrating genomic results with familial and developmental context to understand individual vulnerabilities and support needs.

18

Selective prediction as a triage gate for primary-care depression screening: quantifying and mitigating selection bias in CHARLS-2011

Wang, Z.; Liu, Y.

2026-07-20 health informatics 10.64898/2026.07.17.26357845 medRxiv

Top 8%

0.1%

Background: Primary care in China lacks structured mental-health assessment, and the machine-learning models that could support such screening are typically developed on heavily selected samples. Cumulative inclusion and exclusion criteria, though usually treated as neutral data-cleaning steps, can create heterogeneity in predictive reliability among retained participants. Using the China Health and Retirement Longitudinal Study (CHARLS) 2011 baseline, we quantified how selection funnels distort epidemiological associations and inflate machine-learning metrics, and tested selective prediction as mitigation. Methods: Using the CHARLS 2011 baseline with temporal external validation in CHARLS-2018, we built a four-level selection funnel (L0-L3), evaluated five classifiers with nested cross-validation and SMOTE, and compared model-embedded uncertainty with a decoupled predictor-selector framework; XGBoost cross-validation residuals drove risk stratification and classification and regression tree (CART) rules. Results: Sample sizes fell from L0 n=17,705 to L3 n=4,256 (24.0%). The cancer-depression odds ratio attenuated from 1.78 (95% CI 1.32-2.41) to 1.39 (0.74-2.63), losing significance. AUC rose with selection but not after multiple-comparison correction, whereas calibration error increased for four of five models. Model-embedded uncertainty succeeded only for XGBoost; with the decoupled XGBoost residual selector, all five models achieved selective prediction at approximately 20% coverage (test AUC 0.90, 95% CI 0.85-0.95), abstaining on approximately 80% of cases for individual safety. Risk stratification was stable (residual Spearman correlations >0.95; multi-seed Jaccard 0.88), and CART rules used self-rated health, education, pain, and marital status.
Conclusions: The findings support a deployable primary-care triage pathway: a four-variable rule identifies patients suitable for algorithm-assisted scoring (approximately 20% coverage) and routes the remainder to human evaluation. Methodologically, cumulative selection bias produces a dual distortion: epidemiological associations are compressed and machine-learning metrics inflated. Selective prediction is limited mainly by uncertainty-indicator design. Performance metrics should be reported with selection level, coverage, and calibration trajectory. Decoupled selective prediction with CART rule extraction provides an actionable framework for quality-controlled, tiered-care deployment. Keywords: selective prediction, selection bias, CHARLS, depression, predictor-selector decoupling, uncertainty quantification, classification and regression tree, triage, clinical decision support, health management.
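
The triage gate described above can be sketched generically as follows, assuming the predictor's probabilities and a separate selector's uncertainty scores are already available. The function name `selective_predict`, the 0.5 decision threshold, and the rank-based cut-off are assumptions for illustration; the study's XGBoost residual selector and CART rules are not reproduced here.

```python
def selective_predict(probs, uncertainty, coverage):
    """Predict only on the most certain fraction of cases; abstain on the rest.

    probs:       predicted probabilities from any classifier
    uncertainty: scores from a separate selector (lower = more reliable)
    coverage:    fraction of cases to keep, e.g. 0.2
    Returns a list with 0/1 for retained cases and None for abstentions.
    """
    n_keep = max(1, round(coverage * len(probs)))
    # Rank cases from most to least reliable according to the selector.
    ranked = sorted(range(len(probs)), key=lambda i: uncertainty[i])
    keep = set(ranked[:n_keep])
    return [
        (1 if probs[i] >= 0.5 else 0) if i in keep else None
        for i in range(len(probs))
    ]
```

In a triage setting, cases returned as `None` would be routed to human evaluation, and performance metrics would be reported alongside the chosen coverage.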

19

Bayesian shared-component spatiotemporal modeling of sexually transmitted infection co-occurrence: identifying geographic vulnerability across 204 countries, 1990-2023

Ma, Q.; Zhang, T.; Lin, D.; Zou, W.

2026-07-21 epidemiology 10.64898/2026.07.19.26358422 medRxiv

Top 8%

0.1%

Objectives: Although HIV incidence has declined in some settings, the overall global burden of sexually transmitted infections remains a major public health concern. In the context of the World Health Organization's call for people-centred STI prevention and care, identifying the shared geographic pattern of multiple STIs using data-driven analysis may help detect vulnerable areas and inform integrated prevention strategies. Methods: We analysed country-level incidence counts from the Global Burden of Disease 2023 study for 204 countries and territories over 1990-2023. A Bayesian shared-component spatiotemporal model was fitted, decomposing each disease's log-rate into a shared spatial component (scaled intrinsic conditional autoregressive prior), disease-specific spatial deviations, disease-specific first-order random walk temporal effects, and five socioeconomic covariates, with a negative binomial likelihood to accommodate overdispersion. The shared spatial score - the posterior mean of the shared spatial component - was used as a continuous index of STI co-occurrence burden. Posterior exceedance probabilities quantified directional stability. External validity was assessed via Spearman correlation with the Socio-demographic Index and generalised estimating equation regression of HIV/AIDS mortality on the shared score. Results: The shared spatial score exhibited marked geographic heterogeneity. The five highest-scoring countries were Eswatini (2.25), Lesotho (2.13), Malawi (1.90), Mozambique (1.89), and South Africa (1.85), all in southern Africa. Fifty-seven countries had high directional stability (posterior exceedance probability >0.95), concentrated in sub-Saharan Africa and the Caribbean. The score correlated negatively with SDI (Spearman $\rho = -0.619$, $p = 6.4 \times 10^{-23}$) and positively with HIV/AIDS mortality (incidence rate ratio = 14.64 per standard deviation, 95% CI: 11.90-18.01).
Prior sensitivity analysis confirmed near-perfect ranking stability ($\rho \geq 0.9999$). Conclusions: STI co-occurrence is geographically concentrated, with the highest shared burden in sub-Saharan Africa and persistently elevated shared spatial signals also observed in parts of mainland Southeast Asia and the Caribbean. The shared spatial score provides a data-driven tool for prioritising integrated STI screening and prevention resources across countries.
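
One way to write the decomposition described in the Methods is given below, as a sketch only. The symbols, the disease-specific loading $\delta_d$, the population offset $\log P_{i,t}$, and the choice of disease-specific covariate effects $\boldsymbol{\beta}_d$ are assumptions; the abstract does not give the exact parameterisation.

$$
y_{d,i,t} \sim \mathrm{NegBin}(\mu_{d,i,t},\, r_d), \qquad
\log \mu_{d,i,t} = \log P_{i,t} + \alpha_d + \delta_d\,\phi_i + \psi_{d,i} + \gamma_{d,t} + \mathbf{x}_{i,t}^{\top}\boldsymbol{\beta}_d
$$

Here $y_{d,i,t}$ is the incidence count of disease $d$ in country $i$ and year $t$, $\phi_i$ is the shared spatial component with a scaled intrinsic conditional autoregressive prior, $\psi_{d,i}$ is the disease-specific spatial deviation, $\gamma_{d,t}$ follows a first-order random walk with $\gamma_{d,t} - \gamma_{d,t-1} \sim \mathcal{N}(0, \sigma_d^2)$, and $\mathbf{x}_{i,t}$ holds the five socioeconomic covariates. Under this notation the shared spatial score of country $i$ is the posterior mean $\mathbb{E}[\phi_i \mid y]$.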

20

Diagnostic analytics of routine Clinical Competency Committee data of six cohorts in family medicine program in the UAE, utilizing Milestones, EPA, and ITE

Baynouna Alketbi, L. M.; Nagelkerke, N.; Alzarouni, A.; AlKwuiti, M.

2026-07-16 medical education 10.64898/2026.07.13.26356644 medRxiv

Top 9%

0.1%

In competency-based medical education (CBME), longitudinal data are generated continuously. The judgments a Clinical Competency Committee (CCC) makes about trainee learning and performance are a valuable resource, supporting both resident and program development. Such data can also enable the evaluation of rating quality and of CBME instruments such as Milestones and Entrustable Professional Activities (EPAs), which can help address a gap in the CBME literature, where evidence on the performance of these instruments remains limited. Objective: Routinely gathered CCC data from six cohorts in a four-year ACGME-I-accredited family medicine residency in Al Ain, United Arab Emirates, were studied to describe growth trajectories, rating-system behavior, and the concurrent agreement of CBME instruments. The study also investigated the prospective predictive validity of two CBME instruments, EPAs and Milestones, and of the In-Training Exam (ITE). Methods: The longitudinal CCC data for 80 residents across six cohorts (2019-20 to 2024-25) were assessed at up to eight time points (mid- and end-year; R1-R4). The pooled dataset included 10,458 EPA item ratings across 334 resident-time points, 5,021 Competency Milestone item ratings across 285 resident-time points, and 185 ITE scores. Five research questions were examined: growth trajectories; within- and between-resident variation and straight-lining (identical scores on assessed items at a single time point); EPA-Milestones agreement; the validity of supervisor ratings against the ITE (anchoring diagnostic, same-year correlations, prospective regressions); and EPA blueprint fidelity (the mapping of EPAs against the ACGME-I subcompetencies). Al Ain trajectories were benchmarked against an international family medicine reference. Results: All three instruments rose steadily across the eight timepoints. By End R4, the Milestones mean (4.00, range 3.83-4.24) matched US end-of-training norms (3.84-4.02).
With regard to rating quality, pooled R1-R3 Milestones straight-lining was 2.3% (EPA 0%), below US benchmarks; between-resident discrimination was preserved (SD 0.41-0.54); and longitudinal halo was ruled out (within-domain growth-slope r = 0.61 vs across-domain r = 0.37). End R1 Overall EPA was the strongest prospective predictor of Final Competency (B = 0.96, p < 0.001) and Final ITE (B = 96.88, p = .006). Medical Knowledge (MK) ratings were independent of prior ITE scores from Mid R2 onward, and End R2 MK ratings predicted ITE 17 months later at r = 0.88, indicating that supervisor judgment was not anchored to test results. With regard to whether individual EPAs correlate with individual Milestone subcompetencies at each timepoint, significant EPA-Milestones correlations were negligible at End R1 (1 of 222 item-level cells significant) and converged by End R3 (36 cells), while resident-mean stepwise regressions showed that the two instruments behaved as overlapping predictors throughout. This indicates that EPAs and Milestones are complementary at the level of specific content but convergent at the level of aggregate resident judgment. Blueprint fidelity rose from 30% of cells reaching r ≥ 0.40 at End R2 to 80% at End R3 in the same cohort, indicating that apparent fidelity is materially affected by measurement timing. Conclusion: By graduation, residents demonstrated substantial and progressive competency achievement across both instruments, with the majority reaching the entrustable threshold on both EPA and Milestone ratings. The rating system demonstrated disciplined assessment behavior by supervisors and both concurrent and prospective validity relative to the ITE. Overall EPA at End R1 was the strongest prospective predictor of all three terminal outcomes (final ITE score, graduating Competency Milestones, and graduating overall EPA), outperforming Milestones and baseline knowledge.
Routine CCC data support an evidence-based quality assurance framework spanning rater-process diagnostics, outcome-validity diagnostics, and the asymmetric-instrument diagnostic, requiring no additional data collection beyond existing program processes.